Abstract

A general, novel approach that maps discrete, combinatorial, graph-theoretic problems onto
"physical" models, namely n-simplexes in n - 1 dimensions, is applied to the graph equivalence
problem. It is shown to solve this long-standing problem in a short time that is polynomial in n.
INTRODUCTION



A graph G consists of n vertices $V_i$ connected by edges $E_{ij}$. It is described by a
connectivity matrix C with:

$$ C_{ij} = C_{ji} = 0 \ (1) \ \text{for disconnected (connected) } V_i \text{ and } V_j, \quad i \neq j = 1, \dots, n; \qquad C_{ii} = 0. \tag{1} $$

Vertex relabelling $i \to p(i)$ leaves G invariant but changes C according to

$$ C \to C' = P^{T} C P \tag{2} $$

with P an orthogonal matrix with only one non-zero element in each row i, located in column
j = p(i), which represents the above permutation:

$$ P_{ij} = \delta_{j,\,p(i)} \tag{3} $$
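The relabelling of Eqs. (2) and (3) can be sketched as follows. This is a minimal illustration, assuming NumPy; the 3-vertex path graph, the permutation `p` and the helper name `permutation_matrix` are hypothetical choices, not taken from the text.

```python
import numpy as np

def permutation_matrix(p):
    """Return P with P[i, p[i]] = 1, i.e. P_ij = delta_{j, p(i)} as in Eq. (3)."""
    n = len(p)
    P = np.zeros((n, n), dtype=int)
    P[np.arange(n), p] = 1
    return P

# Hypothetical example: the path graph 0 - 1 - 2.
C = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])
p = np.array([1, 2, 0])        # relabelling i -> p(i)
P = permutation_matrix(p)
C_prime = P.T @ C @ P          # Eq. (2)

# With this convention C_prime[p(i), p(j)] == C[i, j]:
# vertex i of the original labelling becomes vertex p(i).
```

Indexing `C_prime[np.ix_(p, p)]` undoes the relabelling and returns the original matrix.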

The graph equivalence problem is the following: "Given C and C', how can we decide, in
time which is polynomial in n, if both correspond to the same topological graph G or to
different graphs? Or, stated differently, does a permutation matrix P for which Eq. (2) holds
exist, and what is this P matrix?"

Exhaustive testing of all n! permutations is impractical even for moderate n. A more
systematic search for P performs just those transpositions which enhance an "overlap" function,
say

$$ \mathrm{tr}\, C^{T} C' = \sum_{ij} C_{ij} C'_{ij} \tag{4} $$

However, the changes in C' (and in $\mathrm{tr}\, C^{T} C'$) due to any permutation are finite
rather than infinitesimal. There is no algorithm for systematically enhancing
$\mathrm{tr}\, C^{T} C'$, as subsequent transpositions may undo the improvement due to previous
permutations.
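To make the cost of exhaustive testing concrete, the following sketch tries all n! relabellings and uses the overlap of Eq. (4) as the acceptance test. It assumes NumPy and 0/1 connectivity matrices; the function names and the small test graphs are hypothetical.

```python
import itertools
import numpy as np

def overlap(C, C_prime):
    """Overlap tr(C^T C') = sum_ij C_ij C'_ij of Eq. (4)."""
    return int(np.sum(C * C_prime))

def brute_force_match(C, C_prime):
    """Try all n! relabellings q; return one for which C_prime[q][:, q] equals C."""
    if C.sum() != C_prime.sum():       # different edge counts: not equivalent
        return None
    best = overlap(C, C)               # largest overlap any relabelling can reach
    for q in itertools.permutations(range(len(C))):
        idx = np.array(q)
        # For 0/1 matrices with equal sums, maximal overlap means equality.
        if overlap(C, C_prime[np.ix_(idx, idx)]) == best:
            return idx
    return None

path = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
shuffled = np.array([[0, 0, 1], [0, 0, 1], [1, 1, 0]])   # relabelled path
triangle = np.ones((3, 3), dtype=int) - np.eye(3, dtype=int)

q = brute_force_match(path, shuffled)
no_match = brute_force_match(path, triangle)
```

The loop visits up to n! permutations, which is the factorial cost that the continuous approach is meant to avoid.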

Our basic suggestion is: instead of using discrete, large changes of, say, just two elements
in a transposition $(i \leftrightarrow j)$, we modify, in each step, all elements by small amounts.

Such "continuous" changes seem impossible: in the strict formal approach there are no
"continuous permutations".
THE DYNAMICAL MODEL FOR SIMPLEX DISTORTION



We use a symmetric n-simplex (in n - 1 dimensions) to represent our graph. The "abstract"
vertices $V_i$ of G (or C) are mapped onto the geometrical vertices $\vec r_i$, $i = 1, \dots, n$,
of the simplex. The symmetric configuration with all $|\vec r_i - \vec r_j|$,
$i \neq j = 1, \dots, n$, equal is the starting point of our algorithms.
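One possible way to build such a symmetric starting configuration is sketched below, assuming NumPy. The construction (centred and normalized standard basis vectors) and the name `regular_simplex` are assumptions for illustration; the points are stored with n coordinates but span only an (n - 1)-dimensional subspace.

```python
import numpy as np

def regular_simplex(n):
    """n unit vectors with equal pairwise distances, spanning n - 1 dimensions."""
    R = np.eye(n) - 1.0 / n                              # basis vectors minus their centroid
    return R / np.linalg.norm(R, axis=1, keepdims=True)  # place every vertex on the unit sphere

R = regular_simplex(5)
D = np.linalg.norm(R[:, None] - R[None, :], axis=-1)    # pairwise distance matrix
```

For this construction the common distance is $a = \sqrt{2n/(n-1)}$, since any two distinct unit vectors have scalar product $-1/(n-1)$.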

The motion generated by the dynamics was designed to distort the simplex by shifting
its vertices from the symmetric initial positions. The distorted simplex then reveals
characteristic features of the graph G [1].

The original aim of the distortion algorithm was to find groups of vertices in G with higher
than average mutual connectivity, and to assess the distances between the various clusters in
the graph.

To this end attractive (repulsive) interactions were introduced between fictitious point
objects at $\vec r_i$ and $\vec r_j$ when the corresponding vertices $V_i$ and $V_j$ are
connected (disconnected) in G. We use first order "Aristotelian" dynamics:

$$ \frac{d\vec r_i(t)}{dt} = \vec F_i(\vec r_i(t)), \tag{5} $$

with forces which derive from potentials:

$$ \vec F_i = -\vec\nabla_{(\vec r_i)} \{ U[\vec r_1, \dots, \vec r_n] \}, \tag{6} $$

$$ U = \sum_{i>j} U_a(|\vec r_i - \vec r_j|)\, C_{ij} + \sum_{i>j} U_r(|\vec r_i - \vec r_j|)\,(1 - C_{ij}), \tag{7} $$

where $U_a(r)$ ($U_r(r)$) are attractive (repulsive) pair-wise potentials.
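The text leaves the pair potentials unspecified. As a sketch only, the block below assumes the quadratic choices $U_a(\rho) = \rho^2/2$ and $U_r(\rho) = -\rho^2/2$ (an illustrative assumption, not the authors' potentials), for which the force of Eq. (6) reduces to $\vec F_i = \sum_j (2C_{ij} - 1)(\vec r_j - \vec r_i)$. NumPy and the function names are likewise assumptions.

```python
import numpy as np

def potential(R, C):
    """U of Eq. (7) for the assumed U_a(rho) = rho^2/2 and U_r(rho) = -rho^2/2."""
    diff = R[:, None, :] - R[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)          # squared pair distances
    sign = 2 * C - 1                         # +1 for connected, -1 for disconnected pairs
    iu = np.triu_indices(len(C), k=1)        # each pair counted once
    return 0.5 * np.sum(sign[iu] * d2[iu])

def forces(R, C):
    """F_i = -grad_i U of Eq. (6) for the same assumed potentials."""
    sign = 2 * C - 1
    diff = R[None, :, :] - R[:, None, :]     # diff[i, j] = r_j - r_i
    return np.sum(sign[:, :, None] * diff, axis=1)

# Hypothetical example: path graph 0 - 1 - 2 - 3 with random positions.
rng = np.random.default_rng(0)
R = rng.normal(size=(4, 3))
C = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
F = forces(R, C)
```

Connected vertices are pulled together and disconnected ones pushed apart; because the pair forces are equal and opposite, the total force sums to zero.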

By a proper tuning of the latter, which can even be modified as a function of "time", we
can physically cluster at separate locations groups of points representing strongly (internally)
connected clusters in the graph G.

To avoid collapse towards the origin (or a "run-away" to infinity) if $U_a$ (or $U_r$) dominates,
we force $\vec r_i(t)$ to stay, at all times, on the unit sphere:

$$ |\vec r_i(t)| = 1 \quad \text{for all } t > 0. \tag{8} $$

The graph characterization (G.C.) and graph equivalence (G.E.P.) problems are very closely
connected. If we could find (in a polynomial number of steps!) a set of real numbers
$p_1, p_2, \dots, p_m$ that would completely characterize a graph G, then the G.E.P. is readily
solved. All we need to do is to compute for C (C') these numbers $\{p_k\}$ ($\{p'_k\}$), order
the $p_k$ and $p'_k$ sets separately and compare them.

The set of eigenvalues $(\lambda_1, \dots, \lambda_n)$ of the connectivity matrix is certainly invariant under
relabelling. While this set encodes a rich body of information of graph theoretic interest, it 
fails to completely characterize graphs 0]. 
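A standard small example of this failure is sketched below, assuming NumPy: the star $K_{1,4}$ and the 4-cycle plus one isolated vertex have the same spectrum $\{2, 0, 0, 0, -2\}$ but different degree sequences, so they are not equivalent. The example pair is chosen here for illustration and is not taken from the text.

```python
import numpy as np

def spectrum(C):
    """Eigenvalues of a symmetric connectivity matrix, in ascending order."""
    return np.linalg.eigvalsh(C.astype(float))

# Star K_{1,4}: vertex 0 joined to vertices 1..4.
star = np.zeros((5, 5), dtype=int)
star[0, 1:] = 1
star[1:, 0] = 1

# 4-cycle on vertices 0..3 plus the isolated vertex 4.
cycle_plus_point = np.zeros((5, 5), dtype=int)
for i in range(4):
    j = (i + 1) % 4
    cycle_plus_point[i, j] = cycle_plus_point[j, i] = 1

same_spectrum = np.allclose(spectrum(star), spectrum(cycle_plus_point))
same_degrees = sorted(star.sum(axis=0)) == sorted(cycle_plus_point.sum(axis=0))
```

Here `same_spectrum` is true while `same_degrees` is false, which is exactly the situation in which the sorted eigenvalues alone cannot decide equivalence.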

An alternative and natural simple variable helping characterize connectivity matrices is 
the mutual entropy (see, for example [§]). Suppose the connectivity matrix C has been 
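normalized so that

$$ \sum_{i,j=1}^{n} C_{ij} = 1. \tag{9} $$

$C_{ij}$ could then be considered as the probability that $V_i$ and $V_j$ are connected, with
$P_i = \sum_j C_{ij}$ the corresponding marginal probability for the row vertex $V_i$. The
corresponding entropy

$$ H(\mathrm{row}) = -\sum_{i=1}^{n} P_i \log P_i, \tag{10} $$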

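could be considered as a measure of the uncertainty of the row connections for the given
network. The amount of uncertainty for the connection of the column nodes, given that the
row nodes are connected, is

$$ H(\mathrm{column}|\mathrm{row}) = -\sum_{i,j=1}^{n} C_{ij} \log C_{ij} - H(\mathrm{row}). \tag{11} $$

As a result the amount of mutual information gained via the given connectivity of the
network is

$$ I(C) = H(\mathrm{row}) + H(\mathrm{column}) - H(\mathrm{row}, \mathrm{column}) = \sum_{i,j=1}^{n} C_{ij} \log \frac{C_{ij}}{P_i P_j}, \tag{12} $$

where

$$ H(\mathrm{row}, \mathrm{column}) = -\sum_{i,j=1}^{n} C_{ij} \log C_{ij} \tag{13} $$

is the joint entropy, so that (12) equals $H(\mathrm{column}) - H(\mathrm{column}|\mathrm{row})$.
Terms with $C_{ij} = 0$ contribute zero to these sums.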

Due to the double summation and the symmetry of the connectivity matrix, $I(C)$ does not
depend on the vertex relabelling and is a permutation-invariant measure for the connectivity
matrix.
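The mutual information can be evaluated directly from the normalized matrix. The sketch below assumes NumPy; the function name and the two 4-vertex test graphs (a path and a star) are hypothetical examples.

```python
import numpy as np

def mutual_information(C):
    """I(C) = sum_ij W_ij log(W_ij / (P_i P_j)) with W = C / sum(C)."""
    W = C / C.sum()                  # normalization: entries sum to 1
    P = W.sum(axis=1)                # marginals P_i
    mask = W > 0                     # terms with C_ij = 0 contribute nothing
    ratio = W[mask] / np.outer(P, P)[mask]
    return float(np.sum(W[mask] * np.log(ratio)))

path = np.array([[0, 1, 0, 0],
                 [1, 0, 1, 0],
                 [0, 1, 0, 1],
                 [0, 0, 1, 0]])
star = np.array([[0, 1, 1, 1],
                 [1, 0, 0, 0],
                 [1, 0, 0, 0],
                 [1, 0, 0, 0]])

q = np.array([2, 0, 3, 1])           # an arbitrary relabelling
path_relabelled = path[np.ix_(q, q)]

I_path = mutual_information(path)
I_star = mutual_information(star)
I_relabelled = mutual_information(path_relabelled)
```

`I_path` and `I_relabelled` agree, as the invariance argument requires, while the star gives $\log 2$ and the path gives $(2\log 3 + \log 1.5)/3$, so this pair of three-edge graphs is told apart.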

Calculating the mutual entropy for two connectivity matrices provides an easy way
to distinguish between the corresponding graphs when they differ. If, however, the entropies are
the same, the more detailed approach below is used. Amusingly we found that the entropy 
is already sufficient to distinguish between the lowest cospectral graphs (see, for example 0] 
and references therein). 
The distances between the various vertices

$$ r_{ij}(t) = |\vec r_i(t) - \vec r_j(t)| \tag{14} $$

vary in our original algorithm as a function of time away from the original common value:

$$ r_{ij}(0) = |\vec r_i(0) - \vec r_j(0)| = a \quad \text{for all } i \neq j = 1, \dots, n. \tag{15} $$

Also, in identical simulations of the dynamical evolution, the sets of relative distances
computed for C and C' should be the same if C and C' are equivalent:

$$ \{ r_{ij}(t) \} = \{ r'_{ij}(t) \} \tag{16} $$

One permutation of n elements (namely that which, via Eqs. (2) and (3), brings C into C')
should yield:

$$ |\vec r_i(t) - \vec r_j(t)| = |\vec r'_{p(i)}(t) - \vec r'_{p(j)}(t)| \tag{17} $$



It is straightforward to verify (16) and then, using (17), to recover the permutation $i \to p(i)$.
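One simple way to carry out this recovery is sketched below, assuming NumPy and assuming that the distances are non-degenerate, so that each vertex is identified by the sorted list of its distances to all others. The function name, the tolerance and the random test points are hypothetical; with degenerate distances this greedy matching may fail even for equivalent graphs.

```python
import numpy as np

def recover_permutation(D, D_prime, tol=1e-8):
    """Find p with D[i, j] == D_prime[p[i], p[j]], or return None."""
    n = len(D)
    sig = np.sort(D, axis=1)               # per-vertex signature: sorted distances
    sig_prime = np.sort(D_prime, axis=1)
    p = np.full(n, -1)
    used = np.zeros(n, dtype=bool)
    for i in range(n):
        for k in range(n):
            if not used[k] and np.allclose(sig[i], sig_prime[k], atol=tol):
                p[i] = k
                used[k] = True
                break
        if p[i] < 0:
            return None                    # no partner found: distance sets differ
    # Final check of the full pairwise relation between the two matrices.
    if np.allclose(D, D_prime[np.ix_(p, p)], atol=tol):
        return p
    return None

# Hypothetical test data: distances of random points, then relabelled.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))
D = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
p_true = np.array([3, 0, 5, 1, 4, 2])
D_prime = np.empty_like(D)
D_prime[np.ix_(p_true, p_true)] = D        # D_prime[p(i), p(j)] = D[i, j]

p = recover_permutation(D, D_prime)
```

The matching costs $O(n^3)$ comparisons in this naive form, which is polynomial, in contrast to the n! cost of exhaustive testing.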
In essence the idea of the present algorithm is to use the distortion of the simplex
$S(0) \to S(t)$ (i.e. $\vec r_i(0) \to \vec r_i(t)$) generated via the dynamics of attraction
(repulsion) between connected (disconnected) vertices in G to bring out an "intrinsic shape"
of the graph.

Initially all vertices were at equal distances[^. All the information pertaining to the 

graph was encoded in the interactions of Eq. (7).

After enough evolution steps, each vertex moves appreciably away, namely by

$$ |\vec r_i(t) - \vec r_i(0)| \gtrsim a/2 \tag{18} $$

from its initial position. The information on the specific graph G is reflected in the
geometrical shape of S, i.e. the set of distances

$$ |\vec r_i(t) - \vec r_j(t)|, \quad i \neq j = 1, \dots, n. \tag{19} $$

Vertices which are near in a graph-theoretic sense, namely those joined by many short
connecting paths in the graph, move closer together. (A short path consists of a small number
of consecutive edges which starts at $V_i$, say, and terminates at $V_j$.) Likewise, vertices
which are far in a graph-theoretic sense, i.e. have fewer and longer connecting paths, will
tend to move further away.

In our earlier work0 we sought to identify "clusters in the graphs" namely have the 
points corresponding to a subset {Cj} of vertices in the graph which have relatively strong 
mutual, internal, connectivity, collapse to a single point. 

For the present purpose we need not (and should not!) pursue the evolution that far, as by
then the graph simplifies and some of the inter-cluster details are lost. Rather we need to
stop "half-way": after Eq. (18) holds and yet no cluster has completely collapsed.

Note that in n - 1 dimensions all the n(n - 1)/2 distances $|\vec r_i(t) - \vec r_j(t)|$ are
independent, apart from triangular inequalities of the form

$$ |\vec r_i(t) - \vec r_j(t)| \le |\vec r_i(t) - \vec r_k(t)| + |\vec r_k(t) - \vec r_j(t)|. \tag{20} $$

Jointly these distances specify the geometric shape of S. 

The mapping of the n(n - 1)/2 bits of information, $C_{ij} = 0$ or 1, via our dynamic
evolution, into the set of n(n - 1)/2 distances, is highly non-linear. The fact that we have
n(n - 1)/2 distances (rather than just n eigenvalues) makes the former more likely to specify
the graphs.

Further we note that the time t when the comparisons are made and the attractive and
repulsive interactions in Eq. (7) above are free parameters and functions. Hence we can
repeat the above graph comparisons for many values of t and/or many functions $U_r(\rho)$,
$U_a(\rho)$, making the significance of a successful match extremely high.

If many of the $r_{ij}(t)$ (and $r'_{ij}(t)$) are degenerate, our ability to resolve graphs will
be diminished. However, such degeneracies must stem from some symmetries in the graphs and
the corresponding connectivity matrices. Once these symmetries are identified, the number of
independent $C_{ij}$ (or $C'_{ij}$) and the task of comparing them will be accordingly reduced.

We apply the above approach below, demonstrating its power and versatility. 
THE CONVERGENCE AND COMPLEXITY OF THE DISCRETE MODELINGS
OF THE DYNAMICAL EVOLUTIONS 



We follow the dynamics of the vertex shifts in Eq. (5) by discretizing the first order
equations:

$$ \vec r_i(t + \delta) = \vec r_i(t) + \delta\, \vec F_i(\vec r_i(t)) \tag{21} $$

with $\delta$ a small time increment.
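A single discrete step can be sketched as follows, assuming NumPy. Two choices here are assumptions and not taken from the text: the quadratic pair potentials, which give $\vec F_i = \sum_j (2C_{ij} - 1)(\vec r_j - \vec r_i)$, and the enforcement of the unit-sphere constraint by rescaling each vertex after the step. The path graph and the step size are hypothetical.

```python
import numpy as np

def forces(R, C):
    """F_i = sum_j (2 C_ij - 1)(r_j - r_i), for assumed quadratic pair potentials."""
    sign = 2 * C - 1
    diff = R[None, :, :] - R[:, None, :]      # diff[i, j] = r_j - r_i
    return np.sum(sign[:, :, None] * diff, axis=1)

def euler_step(R, C, delta):
    """One explicit Euler step, then rescale each vertex back onto the unit sphere."""
    R_new = R + delta * forces(R, C)
    return R_new / np.linalg.norm(R_new, axis=1, keepdims=True)

# Hypothetical example: path graph 0 - 1 - 2 - 3 on a regular 4-simplex.
C = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
R = np.eye(4) - 0.25
R /= np.linalg.norm(R, axis=1, keepdims=True)
a = np.linalg.norm(R[0] - R[1])               # common initial distance

for _ in range(5):
    R = euler_step(R, C, delta=0.01)
D = np.linalg.norm(R[:, None] - R[None, :], axis=-1)
```

After a few steps the connected pair (0, 1) is closer than the initial distance `a`, while the two end vertices (0, 3), which are disconnected, are further apart.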

Since $\vec r_i$, $\vec F_i$ are (n - 1)-dimensional vectors, Eq. (21) represents $O(n^2)$
equations for the relevant components. Each force component $F_{i,\alpha}$ is a sum of about
$v_i$ force components, with $v_i$ the valency of the vertex $V_i$, i.e. the number of vertices
connected to it. Hence each step in (21) involves $n^2 \bar v$ calculations, with

$$ \bar v = \sum_i v_i / n \tag{22} $$

the average valency in the graph.

Let us assume that we need to repeat the process of iterating the dynamics, namely Eq. (21),
for s steps in order to achieve the goal(s) of the algorithm(s). These goals vary for
the various problems of interest. For cluster identification we need the points representing
clusters in the graph to physically converge into definable separate regions.

For graph characterization and comparison we need fewer steps, sufficient
to make the distances $r_{ij}(t)$ vary considerably away from their original common value.

The total number of computations involved is $N = O(n^2 s)$ if $\bar v$ is finite and
independent of n, or $N = O(n^3 s)$ for the extreme case when $\bar v \approx n$. For N not to
be polynomial in n, s would have to grow faster than any power of n.
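For orientation, the count can be evaluated for a graph of n = 100 vertices, taking an average valency of 7 as an illustrative value. The number of steps s is left symbolic, since the text does not state it.

```latex
N \simeq n^{2}\,\bar v\, s = 100^{2} \times 7 \times s = 7 \times 10^{4}\, s ,
\qquad
N_{\max} \simeq n^{3}\, s = 10^{6}\, s \quad (\bar v \approx n).
```

Even in the extreme case the work per step stays at a fixed power of n, so the total remains polynomial as long as s does.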

In principle one can envision many types of chaotic dynamical evolution where such a large
number of steps is indeed required.

This is not the case for the first order equations considered here, where the system
consistently moves, along the steepest descent, to a minimum of U, the potential energy.
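The steepest-descent property can be made explicit with a one-line derivation. It assumes the first order dynamics in the form $d\vec r_i/dt = \vec F_i = -\vec\nabla_{\vec r_i} U$ with unit mobility, and it ignores the unit-sphere constraint; with the constraint, $\vec F_i$ would be replaced by its component tangent to the sphere and the same sign argument applies.

```latex
\frac{dU}{dt}
  = \sum_{i=1}^{n} \vec\nabla_{\vec r_i} U \cdot \frac{d\vec r_i}{dt}
  = -\sum_{i=1}^{n} \bigl|\vec F_i\bigr|^{2} \;\le\; 0 .
```

The potential energy therefore never increases along the trajectory, and it stops changing only where all forces vanish.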

If we have a complicated "energy landscape" the system can be trapped in any one of 
the many local minima, a feature which accounts for the difficulty of protein folding[0. 
neural nets and spin glass problems P]. The need to keep the same deterministic evolution
for S and S' representing G and G' in the first "distortion" algorithm excludes, in our case,
the possibility of introducing some stochastic noise to extricate the system from a local
minimum.

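Fortunately our problem does not allow for many minima. Thus let us fix the locations
of all $\vec r_i$, $i = 1, \dots, n - 1$, except $\vec r_n = \vec r$. The velocity is dictated by

$$ \frac{d\vec r}{dt} = -\vec\nabla_{\vec r} \sum_{i=1}^{n-1} \left[ C_{n,i}\, U_a(|\vec r - \vec r_i|) + (1 - C_{n,i})\, U_r(|\vec r - \vec r_i|) \right] \tag{24} $$

Assume we have some local equilibrium at $\vec r_0$. Locally, in the neighborhood of $\vec r_0$,
we can use the variables $\rho_i = |\vec r - \vec r_i|$, $i = 1, \dots, n - 1$, instead of
$x_1, \dots, x_{n-1}$, the n - 1 Cartesian coordinates of $\vec r$. The conditions for an
extremum, $\vec\nabla U(\vec r)|_{\vec r = \vec r_0} = 0$, then require that

$$ \frac{\partial}{\partial \rho_i} U_a(\rho_i) \Big|_{\rho_i = \rho_i^{(0)}} = 0 \quad \text{or} \quad \frac{\partial}{\partial \rho_i} U_r(\rho_i) \Big|_{\rho_i = \rho_i^{(0)}} = 0 \tag{25} $$

Thus for generic monotonic $U_a$, $U_r$, we have no extrema inside the region.
An absolute minimum is obtained at the boundary.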

APPLICATIONS OF THE METHOD 

To demonstrate the power of our approach we considered a graph with 100 vertices, each
of which is randomly connected to seven others. The corresponding connectivity matrix G
is shown in Fig. 1. Random reshuffling transforms G into the matrix B of Fig. 2.
Next we applied our algorithm using a combination of attractive and repulsive forces in
(n - 1) = 99-dimensional space. The vertices of the 100-simplex were allowed to move under
the influence of the forces on the 98-dimensional hypersphere in 99 dimensions. After a
number of steps we analyzed the distances between pairs of vertices of the two simplexes.
We found perfect correspondence between the distance matrices. We also readily obtained the
permutation matrix which maps one distance matrix onto the other. Applying the latter
to the matrix B reproduces exactly the original connectivity matrix G (Fig. 1).
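A scaled-down version of this experiment can be sketched as follows, assuming NumPy. Everything specific here is an assumption made for illustration: the size n = 12 (rather than 100), the random 0/1 matrix, the quadratic pair potentials, the rescaling onto the unit sphere, and the step count and step size.

```python
import numpy as np

def regular_simplex(n):
    """n unit vectors with equal pairwise distances (the symmetric start)."""
    R = np.eye(n) - 1.0 / n
    return R / np.linalg.norm(R, axis=1, keepdims=True)

def evolve(C, steps=20, delta=0.01):
    """Distance matrix after `steps` Euler steps starting from the regular simplex."""
    R = regular_simplex(len(C))
    sign = 2 * C - 1                               # assumed quadratic potentials
    for _ in range(steps):
        diff = R[None, :, :] - R[:, None, :]       # diff[i, j] = r_j - r_i
        R = R + delta * np.sum(sign[:, :, None] * diff, axis=1)
        R /= np.linalg.norm(R, axis=1, keepdims=True)   # stay on the unit sphere
    return np.linalg.norm(R[:, None] - R[None, :], axis=-1)

rng = np.random.default_rng(1)
n = 12
upper = np.triu(rng.integers(0, 2, size=(n, n)), k=1)
G = upper + upper.T                                # symmetric 0/1 matrix, zero diagonal
p = rng.permutation(n)
B = np.empty_like(G)
B[np.ix_(p, p)] = G                                # reshuffled: B[p(i), p(j)] = G[i, j]

D_G, D_B = evolve(G), evolve(B)
iu = np.triu_indices(n, k=1)
same_distance_set = np.allclose(np.sort(D_G[iu]), np.sort(D_B[iu]))
same_after_relabelling = np.allclose(D_G, D_B[np.ix_(p, p)])
```

Because the same deterministic evolution is applied to both matrices from the same symmetric start, the two distance matrices agree up to the relabelling `p`, and applying `p` to B returns G.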



S. Nussinov would like to thank Zohar Nussinov for a crucial comment regarding the
advantage of going to higher dimensions to overcome frustrations and alleviate constraints.
FIG. 1: Connectivity matrix for a 100-vertex graph with 7 random connections for each vertex.
of Boolean logic requirements via adiabatic changing of the Hamiltonian.
FIG. 2: Reshuffled connectivity matrix.